A Stability-based Validation Procedure for Differentially Private Machine Learning
نویسندگان
چکیده
Differential privacy is a cryptographically motivated definition of privacy which has gained considerable attention in the algorithms, machine-learning and datamining communities. While there has been an explosion of work on differentially private machine learning algorithms, a major barrier to achieving end-to-end differential privacy in practical machine learning applications is the lack of an effective procedure for differentially private parameter tuning, or, determining the parameter value, such as a bin size in a histogram, or a regularization parameter, that is suitable for a particular application. In this paper, we introduce a generic validation procedure for differentially private machine learning algorithms that apply when a certain stability condition holds on the training algorithm and the validation performance metric. The training data size and the privacy budget used for training in our procedure is independent of the number of parameter values searched over. We apply our generic procedure to two fundamental tasks in statistics and machine-learning – training a regularized linear classifier and building a histogram density estimator that result in end-toend differentially private solutions for these problems.
منابع مشابه
Differentially Private Learning with Kernels
In this paper, we consider the problem of differentially private learning where access to the training features is through a kernel function only. As mentioned in Chaudhuri et al. (2011), the problem seems to be intractable for general kernel functions in the standard learning model of releasing different private predictor. We study this problem in three simpler but practical settings. We first...
متن کاملDifferentially Private Model Selection via Stability Arguments and the Robustness of the Lasso
We design differentially private algorithms for statistical model selection. Given a data set and a large, discrete collection of “models”, each of which is a family of probability distributions, the goal is to determine the model that best “fits” the data. This is a basic problem in many areas of statistics and machine learning. We consider settings in which there is a well-defined answer, in ...
متن کاملDifferentially Private Algorithms for Empirical Machine Learning
An important use of private data is to build machine learning classifiers. While there is a burgeoning literature on differentially private classification algorithms, we find that they are not practical in real applications due to two reasons. First, existing differentially private classifiers provide poor accuracy on real world datasets. Second, there is no known differentially private algorit...
متن کاملPractical Differential Privacy in High Dimensions
Privacy-preserving, and more concretely differentially private machine learning, is concerned with hiding specific details in training datasets which contain sensitive information. Many proposed differentially private machine learning algorithms have promising theoretical properties, such as convergence to non-private performance in the limit of infinite data, computational efficiency, and poly...
متن کاملPREDICTION OF SLOPE STABILITY STATE FOR CIRCULAR FAILURE: A HYBRID SUPPORT VECTOR MACHINE WITH HARMONY SEARCH ALGORITHM
The slope stability analysis is routinely performed by engineers to estimate the stability of river training works, road embankments, embankment dams, excavations and retaining walls. This paper presents a new approach to build a model for the prediction of slope stability state. The support vector machine (SVM) is a new machine learning method based on statistical learning theory, which can so...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2013